Keep an eye on this space to stay updated with my project activities.
1. Project Overview and Scope 🔎
Istanbul is the most populous city not only in Turkey but also in Europe. Therefore, if a disaster such as an earthquake occurs, the potential for destruction and the need for help would be very high. Since the earthquake disaster at the beginning of 2023, the risky condition of buildings and residences in Turkey has become a matter of public concern.
The purpose of this project is to examine Istanbul, which is widely expected to face a major earthquake and significant destruction. The aim is not only to assess the structural risk of buildings and residences and make district-based interpretations, but also to evaluate the opinions of people living in Istanbul and compare all the results.
Seven datasets were used for this purpose, and various graphs were drawn and interpreted comprehensively to make inferences.
Deprem Senaryosu Analiz Sonuçları is a dataset containing the results of an analysis made on an earthquake scenario that is predicted to have a magnitude of 7.5. Each row represents a neighbourhood. The columns have information mainly about the number of buildings based on damage level, the number of people based on vital status and pipe damages.
2017 Yılı Mahalle Bazlı Bina Sayıları is a dataset about the buildings. Each row represents a neighbourhood. The columns have information mainly about year of construction and number of floors of the buildings.
İstanbul Çevresinde Gerçekleşen Depremler is a dataset about the earthquakes around Istanbul within 1 year. Each row represents an earthquake. In the columns, there are details about the earthquakes.
2.3 Reason of Choice
Istanbul is a city where the risk of earthquake is said to be very high, and it is predicted to cause significant loss of life and property. Almost every day, warnings from experts about the earthquake, which is estimated to be between 7.2 and 7.6 in magnitude, can be seen in news and newspapers. I decided to work on such a current topic, to inform people through comprehensive analysis, to make them more aware, and to encourage taking steps against this disaster to get through with minimal damage.
2.4 Preprocessing
There are 7 datasets in “.csv”, “.xls” and “.xlsx” formats, downloaded from the sources listed in section 2.1 Data Source. All of them were merged into a single Excel file called “deprem.xlsx”, with each dataset on its own sheet, and organized for easier use in the analysis. Spelling errors (wrong letters) in each dataset were then corrected in Excel. Finally, the “deprem.xlsx” file was read and stored in .RData format using the code in section 2.1 Data Source. The .RData file can be downloaded for review.
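The read-and-store step could look roughly like the sketch below. This is not the code from section 2.1; it assumes the readxl package is installed and that “deprem.xlsx” sits in the working directory. The helper name `read_all_sheets` and the choice to save all sheets as a single list are illustrative assumptions.

```r
# Hypothetical helper (not from the project): read every sheet of a workbook
# into a named list of data frames, one element per sheet.
read_all_sheets <- function(path) {
  sheet_names <- readxl::excel_sheets(path)
  sheets <- lapply(sheet_names, function(s) readxl::read_excel(path, sheet = s))
  stats::setNames(sheets, sheet_names)
}

# Only runs when readxl is installed and the workbook is present.
if (requireNamespace("readxl", quietly = TRUE) && file.exists("deprem.xlsx")) {
  datasets <- read_all_sheets("deprem.xlsx")
  # Store the whole list in one file; load("deprem.RData") restores `datasets`.
  save(datasets, file = "deprem.RData")
}
```

Keeping the sheets in one named list makes it easy to check that all seven datasets were read before anything is saved.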
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
“deprem_senaryo” analysis:
The analysis began with the “deprem_senaryo” dataset. First, plots were created from the number of damaged buildings, the number of injured people and the number of pipeline damages. Then the need for temporary shelter was assessed.
Although the data is based on neighborhoods, the analyses have been done based on districts.
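The neighbourhood-to-district step can be sketched as a grouped sum. The example below uses base R so that it runs without extra packages, and the toy rows are invented for illustration; `mahalle_adi` is a hypothetical column name, while the other column names follow the ones used in this project.

```r
# Toy neighbourhood-level data; all numbers are made up for illustration.
toy <- data.frame(
  ilce_adi = c("Fatih", "Fatih", "Tuzla", "Tuzla", "Tuzla"),
  mahalle_adi = c("M1", "M2", "M3", "M4", "M5"),  # hypothetical column
  cok_agir_hasarli_bina_sayisi = c(10, 5, 7, 3, 2),
  can_kaybi_sayisi = c(4, 1, 2, 2, 0)
)

# Sum every neighbourhood row that belongs to the same district.
by_district <- aggregate(
  cbind(cok_agir_hasarli_bina_sayisi, can_kaybi_sayisi) ~ ilce_adi,
  data = toy,
  FUN = sum
)
print(by_district)
```

Plotting the aggregated table gives one point per district instead of one point per neighbourhood.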
“ilçe bazında hasarlı bina sayıları” (4 types of damage):
Bakırköy, Fatih and Küçükçekmece on the “Çok Ağır Hasarlı Binalar” plot,
Avcılar and Bakırköy on the “Ağır Hasarlı Binalar” plot,
Avcılar on the “Orta Hasarlı Binalar” and “Hafif Hasarlı Binalar” plots were attention-grabbing.
Code
p1 <- deprem_senaryo %>%
  ggplot(aes(x = ilce_adi, y = cok_agir_hasarli_bina_sayisi)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("İlçe Adı") +
  ylab("Çok Ağır Hasarlı Bina Sayısı") +
  labs(title = "İlçe Bazında Çok Ağır Hasarlı Bina Sayısı Dağılımı")
p1
Code
p2 <- deprem_senaryo %>%
  ggplot(aes(x = ilce_adi, y = agir_hasarli_bina_sayisi)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("İlçe Adı") +
  ylab("Ağır Hasarlı Bina Sayısı") +
  labs(title = "İlçe Bazında Ağır Hasarlı Bina Sayısı Dağılımı")
p2
Code
p3 <- deprem_senaryo %>%
  ggplot(aes(x = ilce_adi, y = orta_hasarli_bina_sayisi)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("İlçe Adı") +
  ylab("Orta Hasarlı Bina Sayısı") +
  labs(title = "İlçe Bazında Orta Hasarlı Bina Sayısı Dağılımı")
p3
Code
p4 <- deprem_senaryo %>%
  ggplot(aes(x = ilce_adi, y = hafif_hasarli_bina_sayisi)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("İlçe Adı") +
  ylab("Hafif Hasarlı Bina Sayısı") +
  labs(title = "İlçe Bazında Hafif Hasarlı Bina Sayısı Dağılımı")
p4
“kişi sayıları” (4 types):
Bahçelievler and Bakırköy on the “Can Kaybı” plot,
Bahçelievler, Bakırköy, Fatih and Küçükçekmece on the “Ağır Yaralı” plot,
Bahçelievler, Bakırköy and Küçükçekmece on the “Hastanede Tedavi” and “Hafif Yaralı” plots were attention-grabbing.
Code
p5 <- deprem_senaryo %>%
  ggplot(aes(x = ilce_adi, y = can_kaybi_sayisi)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("İlçe Adı") +
  ylab("Can Kaybı Sayısı") +
  labs(title = "İlçe Bazında Can Kaybı Sayısı Dağılımı")
p5
Code
p6 <- deprem_senaryo %>%
  ggplot(aes(x = ilce_adi, y = agir_yarali_sayisi)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("İlçe Adı") +
  ylab("Ağır Yaralı Sayısı") +
  labs(title = "İlçe Bazında Ağır Yaralı Sayısı Dağılımı")
p6
Code
p7 <- deprem_senaryo %>%
  ggplot(aes(x = ilce_adi, y = hastanede_tedavi_sayisi)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("İlçe Adı") +
  ylab("Hastanede Tedavi Sayısı") +
  labs(title = "İlçe Bazında Hastanede Tedavi Sayısı Dağılımı")
p7
Code
p8 <- deprem_senaryo %>%
  ggplot(aes(x = ilce_adi, y = hafif_yarali_sayisi)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("İlçe Adı") +
  ylab("Hafif Yaralı Sayısı") +
  labs(title = "İlçe Bazında Hafif Yaralı Sayısı Dağılımı")
p8
“boru hasarı” (3 types of damage):
Avcılar and Beylikdüzü on the “Doğalgaz Boru Hasarı” plot,
Beylikdüzü on the “İçme Suyu Boru Hasarı” plot,
Avcılar, Bakırköy, Beylikdüzü and Tuzla on the “Atık Su Boru Hasarı” plot were attention-grabbing.
“İlçe bazında hanelerdeki kentsel dönüşüm fikri hakkındaki bireysel öngörülerin dağılımı” (2 types of individual foresights):
In some districts, the red segments representing those who believe urban transformation is needed are particularly noticeable.
Code
# convert the data frame to long format
library(tidyr)
kentsel_donusum_fikri_duzenlenmis <- pivot_longer(
  kentsel_donusum_fikri,
  cols = starts_with("kentsel_donusum_ihtiyac_"),
  names_to = "ongoru",
  values_to = "deger"
)
p16 <- ggplot(kentsel_donusum_fikri_duzenlenmis, aes(x = ilce, y = deger, fill = ongoru)) +
  geom_bar(stat = "identity", position = "stack") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("İlçe") +
  ylab("Öngörü Sayısı") +
  labs(title = "İlçe Bazında Hanelerdeki Kentsel Dönüşüm Fikri Hakkındaki Bireysel Öngörüler")
p16
“ort_hane” analysis:
“İlçe Bazlı Ortalama Hanehalkı Büyüklüğü”:
In the analysis of the “ort_hane” dataset, districts have been ranked from highest to lowest average household size and it has been observed that the most crowded households are in Sultanbeyli.
Code
# order districts from largest to smallest average household size
ort_hane2 <- ort_hane %>%
  mutate(ilce = fct_reorder(ilce, ortalama_hanehalki_buyuklugu, .desc = TRUE))
p17 <- ggplot(ort_hane2, aes(x = ilce, y = ortalama_hanehalki_buyuklugu)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  xlab("İlçe Adı") +
  ylab("Ortalama Hanehalkı Büyüklüğü") +
  labs(title = "İlçe Bazında Ortalama Hanehalkı Büyüklüğü")
p17
We can see from the graph that in more than half of the districts in İstanbul, the average household size is more than 3.
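A reading like "more than half of the districts" can also be checked numerically rather than by eye. The sketch below uses invented household sizes and placeholder district labels; only the column names follow the “ort_hane” dataset.

```r
# Toy data; the values and district labels are made up for illustration.
toy_hane <- data.frame(
  ilce = c("A", "B", "C", "D", "E"),
  ortalama_hanehalki_buyuklugu = c(3.9, 3.4, 3.1, 2.8, 2.5)
)

# Comparing against the threshold gives TRUE/FALSE per district;
# sum() counts the TRUE values and mean() gives their share.
n_above_3 <- sum(toy_hane$ortalama_hanehalki_buyuklugu > 3)
share_above_3 <- mean(toy_hane$ortalama_hanehalki_buyuklugu > 3)
```

Running the same two lines on `ort_hane` would give the exact count and share behind the statement.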
“belediye_nufus” analysis:
“2019 Yılı Belediye Nüfusları” plot:
In the analysis of the “belediye_nufus” dataset, municipalities have been ranked from highest to lowest population and it has been observed that the most crowded municipality is the Esenyurt Municipality.
Code
# order municipalities from largest to smallest 2019 population
belediye_nufus2 <- belediye_nufus %>%
  mutate(belediye = fct_reorder(belediye, nufus_2019, .desc = TRUE))
p18 <- ggplot(belediye_nufus2, aes(x = belediye, y = nufus_2019)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  xlab("Belediye Adı") +
  ylab("Nüfus (2019)") +
  labs(title = "Belediye Bazında 2019 Yılı Belediye Nüfusları Büyüklüğü")
p18
The graph shows that two municipalities, Esenyurt and Küçükçekmece, have populations over 750,000, and that roughly the first third of the municipalities exceed 500,000. This identifies the most populous municipalities, which would need the most help in an earthquake disaster.
“gerceklesen_deprem” analysis:
Analyzing this data could be crucial for examining earthquake activity and identifying potential risks. However, predicting future earthquake magnitudes is difficult because earthquakes are random and unpredictable.
“İstanbul Çevresinde Gerçekleşen Depremlerin Büyüklüklerinin Zaman Serisi Grafiği”:
Code
# parse the time column
gerceklesen_deprem$time <- as.POSIXct(gerceklesen_deprem$time, format = "%Y-%m-%dT%H:%M:%OSZ")
library(ggplot2)
p19 <- ggplot(gerceklesen_deprem, aes(x = time, y = magnitude)) +
  geom_line() +
  labs(title = "İstanbul Çevresinde Gerçekleşen Depremlerin Büyüklükleri", x = "Tarih", y = "Büyüklük") +
  theme_minimal() +
  scale_x_datetime(date_breaks = "1 month", date_labels = "%b %Y") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
p19
Code
p20 <- ggplot(gerceklesen_deprem, aes(x = time, y = `depth/km`)) +
  geom_line() +
  labs(title = "İstanbul Çevresinde Gerçekleşen Depremlerin Derinlikleri", x = "Tarih", y = "Derinlik") +
  theme_minimal() +
  scale_x_datetime(date_breaks = "1 month", date_labels = "%b %Y") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
p20
What stands out in both graphs is the increase in earthquake frequency after January 2020. This may point to rising seismic activity around Istanbul, although a short observation window cannot establish a trend on its own.
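A change in frequency is easier to judge from monthly counts than from a line plot of magnitudes. The sketch below counts events per month; the timestamps are invented, and only the timestamp format follows the one used for the “gerceklesen_deprem” time column.

```r
# Toy timestamps in the same ISO-like format; the events are made up.
toy_times <- c(
  "2019-11-03T08:12:45Z",
  "2019-12-21T17:40:02Z",
  "2020-01-05T10:15:30Z",
  "2020-01-19T23:01:10Z",
  "2020-02-02T04:55:00Z"
)

# Parse as UTC so that month boundaries do not depend on the local time zone.
parsed <- as.POSIXct(toy_times, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

# Label each event with its year-month and count the events per label.
monthly_counts <- table(format(parsed, "%Y-%m"))
print(monthly_counts)
```

Applied to the real dataset, a bar plot of these counts would show directly whether the number of events per month rises after January 2020.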
“İstanbul Çevresinde Gerçekleşen Depremlerin Harita Üzerinde Gösterimi”:
The map shows where earthquakes are concentrated.
District Based Overall Score Calculation, Sorting and Risk Analysis
A dataframe named “analiz” was constructed in order to contain the data to be used for calculating the overall risk score on a district basis. The rows represent the districts and the columns represent the factors. Each factor has a weight in order to calculate the overall risk score and the weights are located in the “weight” vector.
Code
# aggregate the neighbourhood-level datasets to district level
deprem_senaryo2 <- deprem_senaryo %>%
  group_by(ilce_adi) %>%
  summarise(across(starts_with("cok_agir_hasarli_bina_sayisi"):starts_with("gecici_barinma"), sum))
bina_sayi2 <- bina_sayi %>%
  group_by(ilce_adi) %>%
  summarise(across(starts_with("1980_oncesi"):starts_with("9-19 kat_arasi"), sum))

# build the "analiz" data frame
# note: data.frame() applies make.names() by default, so `1980_oncesi` is stored
# as X1980_oncesi, `1-4 kat_arasi` as X1.4.kat_arasi, and so on
analiz <- data.frame(
  ilce_adi = deprem_senaryo2$ilce_adi,
  cok_agir_hasarli_bina_sayisi = deprem_senaryo2$cok_agir_hasarli_bina_sayisi,
  agir_hasarli_bina_sayisi = deprem_senaryo2$agir_hasarli_bina_sayisi,
  orta_hasarli_bina_sayisi = deprem_senaryo2$orta_hasarli_bina_sayisi,
  hafif_hasarli_bina_sayisi = deprem_senaryo2$hafif_hasarli_bina_sayisi,
  can_kaybi_sayisi = deprem_senaryo2$can_kaybi_sayisi,
  agir_yarali_sayisi = deprem_senaryo2$agir_yarali_sayisi,
  hastanede_tedavi_sayisi = deprem_senaryo2$hastanede_tedavi_sayisi,
  hafif_yarali_sayisi = deprem_senaryo2$hafif_yarali_sayisi,
  dogalgaz_boru_hasari = deprem_senaryo2$dogalgaz_boru_hasari,
  icme_suyu_boru_hasari = deprem_senaryo2$icme_suyu_boru_hasari,
  atik_su_boru_hasari = deprem_senaryo2$atik_su_boru_hasari,
  gecici_barinma = deprem_senaryo2$gecici_barinma,
  `1980_oncesi` = bina_sayi2$`1980_oncesi`,
  `1980-2000_arasi` = bina_sayi2$`1980-2000_arasi`,
  `2000_sonrasi` = bina_sayi2$`2000_sonrasi`,
  `1-4 kat_arasi` = bina_sayi2$`1-4 kat_arasi`,
  `5-9 kat_arasi` = bina_sayi2$`5-9 kat_arasi`,
  `9-19 kat_arasi` = bina_sayi2$`9-19 kat_arasi`,
  ortalama_hanehalki_buyuklugu = ort_hane$ortalama_hanehalki_buyuklugu
)

# build the "weight" vector (one weight per factor)
weight <- c(0.09, 0.07, 0.05, 0.03, 0.09, 0.07, 0.05, 0.03, 0.05,
            0.05, 0.05, 0.05, 0.08, 0.05, 0.03, 0.03, 0.05, 0.08)

# compute the overall score; building counts are multiplied by the average
# household size before they are weighted
overall_risk_score <-
  analiz$cok_agir_hasarli_bina_sayisi * weight[1] +
  analiz$agir_hasarli_bina_sayisi * weight[2] +
  analiz$orta_hasarli_bina_sayisi * weight[3] +
  analiz$hafif_hasarli_bina_sayisi * weight[4] +
  analiz$can_kaybi_sayisi * weight[5] +
  analiz$agir_yarali_sayisi * weight[6] +
  analiz$hastanede_tedavi_sayisi * weight[7] +
  analiz$hafif_yarali_sayisi * weight[8] +
  analiz$dogalgaz_boru_hasari * weight[9] +
  analiz$icme_suyu_boru_hasari * weight[10] +
  analiz$atik_su_boru_hasari * weight[11] +
  analiz$gecici_barinma * weight[12] +
  analiz$X1980_oncesi * analiz$ortalama_hanehalki_buyuklugu * weight[13] +
  analiz$X1980.2000_arasi * analiz$ortalama_hanehalki_buyuklugu * weight[14] +
  analiz$X2000_sonrasi * analiz$ortalama_hanehalki_buyuklugu * weight[15] +
  analiz$X1.4.kat_arasi * analiz$ortalama_hanehalki_buyuklugu * weight[16] +
  analiz$X5.9.kat_arasi * analiz$ortalama_hanehalki_buyuklugu * weight[17] +
  analiz$X9.19.kat_arasi * analiz$ortalama_hanehalki_buyuklugu * weight[18]

# district coordinates, in the same row order as "analiz"
latitude <- c(40.8747, 41.1864, 40.9833, 40.9792, 41.0341, 40.9977, 40.9804, 41.0837,
              41.0349, 41.0441, 41.1271, 41.0133, 41.0371, 41.0248, 41.1421, 41.0323,
              41.0542, 41.0412, 41.0551, 41.0203, 41.0576, 41.0105, 40.9903, 41.0717,
              40.8999, 41.0092, 40.9339, 40.8796, 41.0090, 41.1664, 41.1749, 41.0737,
              41.0604, 40.9684, 41.1070, 40.8144, 41.0338, 41.0327, 40.9910)
longitude <- c(29.1294, 28.7389, 29.1278, 28.7214, 28.8330, 28.8506, 28.8724, 28.8169,
               28.9122, 29.0017, 29.0978, 28.6489, 28.9774, 28.5854, 28.4575, 29.1695,
               28.8676, 28.6939, 28.9346, 28.9339, 28.9153, 28.8741, 29.0205, 28.9646,
               29.1936, 28.7757, 29.1650, 29.2580, 29.2109, 29.0500, 29.6096, 28.2479,
               28.9878, 29.2620, 28.8714, 29.3094, 29.1013, 29.0319, 28.8968)
overall_risk_score_df <- data.frame(
  ilce_adi = analiz$ilce_adi,
  latitude = latitude,
  longitude = longitude,
  overall_risk_score = overall_risk_score
)
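The plain weighted terms of the score above amount to a matrix-vector product, which is a compact way to write and check the calculation. The sketch below uses three factors, two placeholder districts and invented weights; it does not cover the terms that are first multiplied by average household size.

```r
# Toy factor table: rows are districts, columns are factors (values made up).
toy_factors <- data.frame(
  cok_agir_hasarli_bina_sayisi = c(100, 40),
  can_kaybi_sayisi = c(50, 10),
  gecici_barinma = c(200, 80),
  row.names = c("District A", "District B")
)

# Illustrative weights, one per column; they are not the project's weights.
toy_weight <- c(0.5, 0.3, 0.2)

# Sanity checks: one weight per factor, and the weights sum to 1.
stopifnot(length(toy_weight) == ncol(toy_factors))
stopifnot(isTRUE(all.equal(sum(toy_weight), 1)))

# Each score is the sum over factors of value * weight for that district.
toy_score <- drop(as.matrix(toy_factors) %*% toy_weight)
print(toy_score)
```

The two checks guard against a weight vector that has drifted out of step with the columns. The 18 weights used in this project also sum to 1.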
After the overall score calculation was completed, ranking was performed and the districts were ranked from highest risk to lowest risk. Finally, the overall scores were shown on the map.
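The ranking step can be sketched with `order()`. The scores and district labels below are invented; only the column names follow `overall_risk_score_df`.

```r
# Toy scores; the values and district labels are made up for illustration.
toy_scores <- data.frame(
  ilce_adi = c("District A", "District B", "District C"),
  overall_risk_score = c(39, 105, 72)
)

# Sort rows from highest to lowest score and attach the rank.
ranked <- toy_scores[order(toy_scores$overall_risk_score, decreasing = TRUE), ]
ranked$rank <- seq_len(nrow(ranked))
print(ranked)
```

Because the coordinates are stored in the same data frame as the scores, they stay attached to the correct district when the rows are sorted.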
As a result of these analyses, districts where precautions need to be taken, where the most loss of life and property could occur, and where potential interventions need to be well planned have been identified. These are Fatih, Bağcılar, Pendik, Küçükçekmece and Esenyurt. If the weights used in calculating the overall risk score are determined by experts, much more reliable and accurate results will be obtained.
4. Results and Key Takeaways 📈
Istanbul is Europe’s most populous city. This means that the potential for destruction and the need for help would be very high if a disaster such as an earthquake occurred.
The purpose of this project is to examine the risky structural condition, to make district-based interpretations and to evaluate the opinions of people living in Istanbul.
Seven datasets were used in this study, and plots were drawn and interpreted comprehensively.
This study has determined which districts are most likely to experience possible building and infrastructure damage in an earthquake disaster. This helps determine where, how and how much aid should be directed.
By calculating the overall risk scores, the districts of Istanbul have been ranked from most to least risky. Thanks to this ranking, the government will be able to take appropriate steps both before and after an earthquake, aiding preparedness and response efforts.
Lastly, this study serves as both a guide and an encouragement for the society to be prepared for this disaster.
This study has
identified the regions most prone to earthquakes and the cities most at risk,
depicted them on maps,
drawn conclusions about the opinions of survey participants and